Application No. 10/663,390 

Amendment in Response to Office Action Dated January 15, 2009 

AMENDMENTS TO THE CLAIMS 

The following Listing of Claims, with amendments to claims 1 and 8, will replace all
prior versions, and listings, of claims in the application. Note that claims 22-50 were 
withdrawn in a prior amendment and remain withdrawn at this time. 

1 (Currently Amended). A system for providing adaptive playback of an audio 
signal received across a packet based network, comprising: 

storing data packets comprising a received audio data signal to a signal buffer; 
outputting parts of the signal present in the signal buffer as needed for signal 
playback; 

analyzing the data packets contained in the signal buffer to determine whether any
data packets are missing, having not been received into the signal buffer by [[an expected
arrival time, said expected arrival time representing]] a predetermined packet late loss time;

specifying a maximum delay period, extending past the predetermined packet late
loss time [[expiration of the expected arrival time]], for receiving any missing data packets;

following the predetermined packet late loss time [[expiration of the expected arrival
time]], stretching at least part of the signal preceding the missing data packets present in
the signal buffer, until any of receiving the missing data packets and exceeding the 
maximum delay period, when the analysis of the contents of the signal buffer indicates that 
the length of the signal in the signal buffer is less than a predetermined threshold; and 

compressing at least part of the signal present in the signal buffer when the analysis 
of the contents of the signal buffer indicates that the length of the signal in the signal buffer 
is greater than a predetermined threshold. 

2 (Original). The system of claim 1 wherein analyzing the contents of the signal 
buffer includes determining a type of the contents of the signal buffer from among a group 
including: periodic content, quasi-periodic content, aperiodic content and mixed content. 

3 (Original). The system of claim 2 wherein stretching at least part of the signal 
having any of periodic content and quasi-periodic content type comprises:

identifying at least one of the segment of the content of the signal buffer as a
template;

searching for a matching segment in portions of the content of the signal buffer 
whose cross correlation peak exceeds a predetermined threshold; and 

inserting the template into the content of the signal buffer, and aligning and merging 
the matching segments. 
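
The template search recited in claim 3 can be illustrated with a short sketch. It assumes a normalized cross correlation as the similarity score and an inclusive search range; the names `normalized_xcorr` and `find_match` are hypothetical and do not come from the application, and the aligning and merging step is not shown.

```python
import math


def normalized_xcorr(a, b):
    """Normalized cross correlation of two equal-length segments (range -1..1)."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0 else 0.0


def find_match(signal, template, search_start, search_end, threshold):
    """Search signal[search_start..search_end] (inclusive start positions) for
    the segment that best matches the template.

    Returns (position, score) for the highest score above the threshold,
    or None when no candidate exceeds the threshold.
    """
    m = len(template)
    last = min(search_end, len(signal) - m)
    best_pos, best_score = None, threshold
    for pos in range(search_start, last + 1):
        score = normalized_xcorr(template, signal[pos:pos + m])
        if score > best_score:
            best_pos, best_score = pos, score
    return None if best_pos is None else (best_pos, best_score)
```

For a signal with a 10-sample period, a template taken at sample 40 would be expected to match one period or more earlier, so the match position gives the pitch-aligned offset at which a copy of the template could be merged.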

4 (Original). The system of claim 2 wherein stretching at least part of the signal 
having aperiodic content the type comprises automatically generating and inserting at least 
one synthetic segment into the buffered signal to increase the length of the content of the 
signal buffer. 

5 (Original). The system of claim 4 wherein automatically generating the at least 
one synthetic segment comprises: 

automatically computing the FFT of the at least part of the signal; 
introducing a random rotation of the phase into the FFT coefficients; and 
computing the inverse FFT for each segment, thereby creating the at least one 
synthetic segment. 
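
The three steps of claim 5 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: it uses a plain O(n^2) discrete Fourier transform in place of a fast FFT so that it needs only the standard library, and it assumes the random phase rotation is applied symmetrically to conjugate bin pairs so that the synthetic segment stays real-valued. The names `dft`, `idft`, and `synthesize_segment` are hypothetical.

```python
import cmath
import math
import random


def dft(x):
    """Plain discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def synthesize_segment(segment, rng=None):
    """Create a synthetic segment with the same magnitude spectrum as the
    input but randomly rotated phases."""
    rng = rng or random.Random()
    n = len(segment)
    spectrum = dft(segment)
    rotated = list(spectrum)
    # Rotate each positive-frequency bin and mirror it to its conjugate bin.
    # The DC bin (and the Nyquist bin for even n) is left untouched.
    for k in range(1, (n + 1) // 2):
        rotation = cmath.exp(1j * rng.uniform(-math.pi, math.pi))
        rotated[k] = spectrum[k] * rotation
        rotated[n - k] = rotated[k].conjugate()
    return [value.real for value in idft(rotated)]
```

Because only the phases change, the synthetic segment keeps the energy and spectral envelope of the source segment while differing from it sample by sample.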

6 (Original). The system of claim 4 wherein automatically generating the at least 
one synthetic segment comprises: 

applying at least one LPC filter to the at least part of the signal to compute an LPC
residual;

computing at least one FFT from the LPC residual; 

introducing a random rotation of the phase into the coefficients of at least one of the 
computed FFTs; 

computing inverse FFTs from the FFT coefficients to reconstruct the LPC residual; and

applying at least one inverse LPC filter to the LPC residual, thereby creating the at 
least one synthetic segment.

7 (Original). The system of claim 1 wherein the predetermined threshold for
stretching and compressing at least part of the signal present in the signal buffer are
optimized to compensate for clock drift between an encoder and a decoder.

8 (Currently Amended). A system for providing an adaptive playback of received 
frames of an audio signal transmitted across a packet-based network, comprising: 

receiving and decoding data frames of an audio signal transmitted across a packet- 
based network; 

storing the decoded data frames to a signal buffer; 

analyzing the contents of the signal buffer to determine whether any data frames
are missing due to corresponding data packets having not been received by [[an expected
arrival time, said expected arrival time representing]] a predetermined packet late loss time;

specifying a maximum delay period, extending past the [[expiration of the expected
arrival time]] predetermined packet late loss time, for receiving any missing data packets;

outputting one or more of the decoded frames present in the signal buffer when the 
analysis of the contents of the signal buffer indicates that the length of the signal in the 
signal buffer is between a predetermined minimum and a predetermined maximum buffer 
size; 

following the [[expiration of the expected arrival time]] predetermined packet late loss
time, stretching and outputting one or more decoded frames preceding the missing data 
packets in the signal buffer, until any of receiving the missing data packets and exceeding 
the maximum delay period, when the analysis of the contents of the signal buffer indicates 
that the length of the decoded frames in the signal buffer is less than the predetermined 
minimum buffer size; and 

compressing and outputting one or more decoded frames in the signal buffer when 
the analysis of the contents of the signal buffer indicates that the length of the decoded 
frames in the signal buffer is greater than the predetermined maximum buffer size. 
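
The buffer-length decisions recited in claim 8 can be sketched as a small decision function. This is an illustration under stated assumptions: lengths and times are in milliseconds, the minimum and maximum bounds are treated as inclusive for normal output, and the late loss time and maximum delay period are both measured from the same reference instant. The names `Action`, `choose_action`, and `keep_stretching` are hypothetical.

```python
from enum import Enum


class Action(Enum):
    OUTPUT = "output"
    STRETCH = "stretch"
    COMPRESS = "compress"


def choose_action(buffered_ms, min_ms, max_ms):
    """Pick a playback action from the current length of the signal buffer."""
    if buffered_ms < min_ms:
        return Action.STRETCH      # buffer running low: slow playback down
    if buffered_ms > max_ms:
        return Action.COMPRESS     # buffer too full: speed playback up
    return Action.OUTPUT           # within bounds: play frames unmodified


def keep_stretching(waited_ms, late_loss_ms, max_delay_ms, packet_arrived):
    """Return True while stretching for a missing packet should continue.

    Stretching starts once the late loss time has passed and stops when the
    packet arrives or the maximum delay period is exceeded.
    """
    if packet_arrived:
        return False
    return late_loss_ms <= waited_ms < max_delay_ms
```

With a minimum of 20 ms and a maximum of 60 ms, for example, a 10 ms buffer would select stretching and an 80 ms buffer would select compression.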

9 (Original). The system of claim 8 wherein any frame output from the signal buffer 
is removed from the signal buffer as it is output.

10 (Original). The system of claim 8 further comprising packet loss concealment for
signal packets declared to be late loss packets.

11 (Original). The system of claim 8 wherein stretching and outputting one or more
decoded frames provides automatic jitter control as a function of buffer content. 

12 (Original). The system of claim 11 wherein stretching one or more decoded
frames further comprises automatically determining a content type of the stretched frames 
prior to stretching those frames. 

13 (Original). The system of claim 12 wherein the content type includes any of 
voiced frames, unvoiced frames, and mixed frames.

14 (Original). The system of claim 13 wherein stretching any voiced frame 
comprises: 

identifying at least one of the segment of the voiced frame as a template; 
searching for a matching segment in adjacent frames whose cross correlation peak 
exceeds a predetermined threshold; and 

aligning and merging the matching segments of the frame. 

15 (Original). The system of claim 8 wherein stretching any unvoiced frame 
comprises automatically generating and inserting at least one synthetic segment into the 
current frame to increase a length of the current frame. 

16 (Original). The system of claim 15 wherein automatically generating the at least 
one synthetic segment comprises: 

automatically computing the FFT of the current frame; 
introducing a random rotation of the phase into the FFT coefficients; and 
computing the inverse FFT for each segment, thereby creating the at least one 
synthetic segment.

17 (Original). The system of claim 15 wherein automatically generating the at least
one synthetic segment comprises:

applying at least one LPC filter to the current frame to compute an LPC residual; 
computing at least one FFT from the LPC residual; 

introducing a random rotation of the phase into the coefficients of at least one of the 
computed FFTs; 

computing inverse FFTs from the FFT coefficients to reconstruct the LPC residual; and

applying at least one inverse LPC filter to the LPC residual, thereby creating the at 
least one synthetic segment. 
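
The LPC analysis and synthesis steps that bracket the phase rotation in claim 17 can be sketched as below. The sketch assumes the autocorrelation method with a Levinson-Durbin recursion to estimate the filter, which the claim does not specify; the phase rotation of the residual is omitted. All function names are hypothetical.

```python
def autocorrelation(x, order):
    """Autocorrelation of x for lags 0..order."""
    n = len(x)
    return [sum(x[i] * x[i - lag] for i in range(lag, n))
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[0..order-1] such that
    x[n] is approximated by sum(a[k] * x[n - 1 - k])."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:
            break  # degenerate signal: leave remaining coefficients at zero
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k
    return a[1:]


def lpc_residual(x, a):
    """Apply the LPC analysis filter: residual = signal minus prediction."""
    return [x[n] - sum(a[k] * x[n - 1 - k] for k in range(min(len(a), n)))
            for n in range(len(x))]


def lpc_synthesize(residual, a):
    """Apply the inverse LPC filter to rebuild a signal from a residual."""
    out = []
    for n, e in enumerate(residual):
        out.append(e + sum(a[k] * out[n - 1 - k]
                           for k in range(min(len(a), n))))
    return out
```

Feeding an unmodified residual back through `lpc_synthesize` reproduces the input frame, so any difference in a synthetic segment comes only from the changes made to the residual in between.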

18 (Original). The system of claim 8 wherein stretching any mixed frame comprises: 
identifying at least one segment of the frame as a template; 

searching for a matching segment whose cross correlation peak exceeds a 
predetermined threshold; 

aligning and merging the matching segments of the frame to create an interim 
voiced segment; 

automatically generating and inserting at least one synthetic segment into the 
current frame to create an interim unvoiced segment; 

weighting each of the interim voiced segment and the interim unvoiced segment 
relative to a normalized cross correlation peak computed for the current segment; and 

adding and windowing the interim voiced segment and the interim unvoiced 
segment to create a partially synthetic stretched segment. 

19 (Original). The system of claim 8 wherein compressing any voiced frame 
comprises: 

identifying at least one segment of the frame as a template; 
searching for a matching segment whose cross correlation peak exceeds a 
predetermined threshold; 

cutting out the signal between the template and the match; and 
aligning and merging the matching segments of the frame.

20 (Original). The system of claim 8 wherein compressing any voiced frame
comprises:

shifting a segment of the frame from a first position in the frame to a second position 
in the frame; 

deleting the portion of the frame between the first position and the second position; and

adding the shifted segment of the frame to the signal representing the remainder of 
the frame by using a sine windowing function for blending the edges of the segment with 
the signal representing the remainder of the frame. 
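
The shift, delete, and blend steps of claim 20 can be sketched as follows. As an assumption not stated in the claim, the sketch uses a squared-sine crossfade so that the two blending weights always sum to one; the function name and the `overlap` parameter are hypothetical.

```python
import math


def compress_with_sine_blend(frame, first, second, overlap):
    """Shorten a frame (a list of samples) by second - first samples.

    The segment starting at `second` is shifted back to `first`, the samples
    in between are dropped, and the shifted segment is crossfaded with the
    original samples at `first` over `overlap` samples.
    """
    if not (0 <= first < second and overlap > 0
            and second + overlap <= len(frame)):
        raise ValueError("positions or overlap out of range")
    blended = []
    for i in range(overlap):
        # Squared-sine weight rising from near 0 to near 1 across the overlap.
        fade_in = math.sin(0.5 * math.pi * (i + 0.5) / overlap) ** 2
        blended.append((1.0 - fade_in) * frame[first + i]
                       + fade_in * frame[second + i])
    return frame[:first] + blended + frame[second + overlap:]
```

In practice the two positions would be chosen one pitch period apart so that the crossfaded samples are nearly identical and the cut is inaudible.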

21. The system of claim 8 wherein both the predetermined minimum buffer size
for stretching one or more decoded frames in the signal buffer and the predetermined 
maximum buffer size for compressing one or more decoded frames in the signal buffer are 
optimized to compensate for clock drift between an encoder and a decoder. 

22 (Withdrawn). A method for adaptive playback of received frames of an audio
signal transmitted across a packet-based network, comprising using a computing device 
to: 

receive a packetized audio signal broadcast across a packet-based network; 
decode each received packet and store the resulting decoded signal frame in a 
signal buffer; 

output a current packet in the case where the current packet has been received 
across the packet-based network; 

instantiate a mute mode whereby a playback of the audio signal is at least partially 
muted when a maximum delay time for receiving the current packet has been exceeded, 
and the current packet has not been received; 

instantiate a packet loss concealment mode whereby the playback of the audio 
signal is modified for reducing audible artifacts resulting from one or more lost packets 
when a current buffer content has been previously temporally stretched, the current packet 
has not yet been received, and a packet subsequent to the current packet has already 
been received.

23 (Withdrawn). The method of claim 22 further comprising analyzing the
content of the signal buffer for determining a current length of the contents of the signal
buffer.

24 (Withdrawn). The method of claim 23 further comprising stretching and 
outputting one or more decoded frames from the signal buffer when the current length of 
the contents of the signal buffer is less than a predetermined minimum buffer size. 

25 (Withdrawn). The method of claim 24 wherein the predetermined minimum 
buffer size is optimized to compensate for clock drift between an encoder and a decoder. 

26 (Withdrawn). The method of claim 23 further comprising compressing and 
outputting one or more decoded frames from the signal buffer when the current length of 
the contents of the signal buffer is greater than a predetermined maximum buffer size. 

27 (Withdrawn). The method of claim 24 wherein the predetermined maximum 
buffer size is optimized to compensate for clock drift between an encoder and a decoder.

28 (Withdrawn). The method of claim 22 wherein modification of the playback of 
the audio signal in the packet loss concealment mode comprises:

computing an average energy for a frame in the signal buffer immediately preceding 
the current packet that has not yet been received; 

computing an average energy for a frame in the signal buffer immediately
succeeding the current packet that has not yet been received; and 

determining a target frame size for both the preceding and succeeding frames as a 
function of the ratio of the average energy of the succeeding frame to the preceding
frame. 

29 (Withdrawn). The method of claim 28 wherein determining a target frame 
size for both the preceding and succeeding frames further comprises stretching the 
succeeding frame and the preceding frames by an amount that is inversely proportional to
the ratio of the average energy. 

30 (Withdrawn). The method of claim 29 wherein instantiating the mute mode 
comprises generating and providing playback of a comfort noise signal to replace lost 
packets, said comfort noise signal being generated from at least one signal frame stored in 
a silence buffer, said signal frame having been determined to represent nominal 
background noise. 

31 (Withdrawn). The method of claim 30 further comprising periodically 
replacing the signal frames in the silence buffer as a function of a computed energy of 
those frames. 

32 (Withdrawn). The method of claim 30 wherein generating the comfort noise 
signal from the at least one signal frame stored in a silence buffer comprises: 

automatically computing the FFT of the at least one signal frame stored in the 
silence buffer; 

introducing a random rotation of the phase into the FFT coefficients; 
computing the inverse FFT for each segment, thereby creating the at least one 
synthetic silence segment; and 

providing the at least one silence segment for playback as the comfort noise signal. 

33 (Withdrawn). A computer-readable medium having computer executable 
instructions for providing adaptive decoding and playback of a packetized audio signal, 
said computer executable instructions comprising: 

receiving a plurality of network packets, said network packets representing a 
packetized audio signal; 

decoding each network packet as it is received and storing the decoded packet as a 
signal frame in a signal buffer;

estimating an LPC filter for each signal frame, computing an LPC residual from
each signal frame using the estimated LPC filter, and storing each LPC residual in an LPC
residual buffer;

examining a current length of the LPC residual buffer; 

stretching and outputting a current LPC residual from the LPC residual buffer when 
the current length of the LPC residual buffer is less than a predetermined minimum buffer 
size; and 

computing an inverse LPC of the stretched LPC residual, and outputting the result 
as a current signal frame. 

34 (Withdrawn). The computer-readable medium of claim 33 wherein the 
predetermined minimum buffer size is optimized to compensate for clock drift between an 
encoder and a decoder. 

35 (Withdrawn). The computer-readable medium of claim 33 further comprising:

compressing and outputting a current LPC residual from the LPC residual buffer
when the current length of the LPC residual buffer is greater than a predetermined
maximum buffer size; and

computing an inverse LPC of the compressed LPC residual, and outputting the 
result as a current signal frame. 

36 (Withdrawn). The computer-readable medium of claim 35 wherein the 
predetermined maximum buffer size is optimized to compensate for clock drift between an 
encoder and a decoder. 

37 (Withdrawn). The computer-readable medium of claim 33 further comprising 
instantiating a mute mode whereby a playback of the audio signal is at least partially 
muted in the case where a maximum delay time for receiving a current packet has been 
exceeded, and the current packet has not been received.

38 (Withdrawn). The computer-readable medium of claim 33 further comprising
instantiating a packet loss concealment mode whereby a playback of the audio signal is 
modified for reducing audible artifacts resulting from one or more lost packets in the case 
where a current LPC residual buffer content has been previously stretched, a current 
packet has not yet been received, and a packet subsequent to the current packet has 
already been received. 

39 (Withdrawn). A method for providing adaptive signal playback, comprising 
using a computing device to: 

receive signal packets representing a digitized audio signal transmitted across a 
packet-based network; 

decode the packets to reconstruct the digitized audio signal; 

store the reconstructed digitized audio signal in a signal buffer; 

provide content of the signal buffer for playback as required by a playback device; 

begin stretching contents of the signal buffer when an expected signal packet has 
not been received at an expected time; and 

continue stretching contents of the signal buffer until a condition selected from (1) 
actual receipt of the expected signal packet, and (2) a determination that the expected 
signal packet is lost. 

40 (Withdrawn). The method of claim 39 wherein the determination that the 
expected signal packet is lost is a function of the amount of stretching already applied to 
the contents of the signal buffer, receipt of one or more subsequent expected signal 
packets, and existing content of the signal buffer. 

41 (Withdrawn). The method of claim 39 further comprising muting playback of 
the audio signal when a predetermined delay time has been exceeded without receiving 
any signal packets.

42 (Withdrawn). The method of claim 39 further comprising stretching contents
of the signal buffer when the length of the contents in the signal buffer is less than a 
predetermined threshold. 

43 (Withdrawn). The method of claim 42 wherein the predetermined threshold is 
optimized to compensate for clock drift between an encoder and a decoder. 

44 (Withdrawn). The method of claim 39 further comprising compressing 
contents of the signal buffer when the length of the contents in the signal buffer exceeds a 
predetermined threshold. 

45 (Withdrawn). The method of claim 44 wherein the predetermined threshold is 
optimized to compensate for clock drift between an encoder and a decoder. 

46 (Withdrawn). The method of claim 39 further comprising removing content 
from the signal buffer as it is provided for playback as required by a playback device. 

47 (Withdrawn). The method of claim 39 further comprising analyzing contents of
the signal buffer to determine a content type of at least part of the contents of the signal 
buffer. 

48 (Withdrawn). The method of claim 39 wherein the content type is quasi- 
periodic, and wherein stretching contents of the signal buffer comprises: 

identifying at least one of the segment of the voiced frame as a template; 
searching for a matching segment in adjacent frames whose cross correlation peak 
exceeds a predetermined threshold; and 

aligning and merging the matching segments of the frame. 

49 (Withdrawn). The method of claim 39 wherein the content type is aperiodic, 
and wherein stretching contents of the signal buffer comprises:

computing at least one FFT from at least one part of the contents of the signal
buffer;

randomizing a phase rotation of the coefficients of at least one of the computed
FFTs;

computing an inverse FFT from the coefficients for each FFT to synthesize a signal 
segment corresponding to each computed FFT; and 

stretching at least part of the contents of the signal buffer by inserting each
synthesized signal segment into the buffered audio signal. 

50 (Withdrawn). The method of claim 49 further comprising: 
applying an estimated LPC filter to the contents of the signal buffer to compute an 
LPC residual for use in place of the contents of the signal buffer for computing the at least 
one FFT from at least one part of the contents of the signal buffer; and 

applying an interpolated inverse LPC filter to the signal segment corresponding to 
each computed FFT prior to stretching at least part of the contents of the signal buffer by 
inserting each synthesized signal segment into the buffered audio signal. 